Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment (Supplementary Materials)
Abstract
We focused on InfoVis research because of its relevance, its scope, and our familiarity with the field. Analysis of academic publications is a common real-world use of topic modeling (Griffiths & Steyvers, 2004). Our familiarity with the InfoVis community allowed us to contact experts capable of exhaustively enumerating its research areas. InfoVis has a single primary conference, simplifying the construction and analysis of its publications.
Similar resources
Topic Model Diagnostics: Assessing Domain Relevance via Topical Alignment
The use of topic models to analyze domain-specific texts often requires manual validation of the latent topics to ensure that they are meaningful. We introduce a framework to support such a large-scale assessment of topical relevance. We measure the correspondence between a set of latent topics and a set of reference concepts to quantify four types of topical misalignment: junk, fused, missing, ...
Topic Cropping: Leveraging Latent Topics for the Analysis of Small Corpora
Topic modeling has gained a lot of popularity as a means for identifying and describing the topical structure of textual documents and whole corpora. There are, however, many document collections such as qualitative studies in the digital humanities that cannot easily benefit from this technology. The limited size of those corpora leads to poor quality topic models. Higher quality topic models ...
Improved Query Topic Models via Pseudo-Relevant Pólya Document Models
Query-expansion via pseudo-relevance feedback is a popular method of overcoming the problem of vocabulary mismatch and of increasing average retrieval effectiveness. In this paper, we develop a new method that estimates a query topic model from a set of pseudo-relevant documents using a new language modelling framework. We assume that documents are generated via a mixture of multivariate Pólya ...
Traffic Scene Analysis using Hierarchical Sparse Topical Coding
Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...
A toponym-based dual vector for topical relevance calculation in focused spatial crawling
A focused crawler is a Web crawler that tries to download only pages that are relevant to a given topic of interest (Siemiński 2009, Almpanidis 2011). That is, a focused crawler must calculate the relevance between pages and a specific topic (Rungsawang, 2005). Recently, specific topics involving spatial information, especially toponyms, such as the topic about the Diaoyu Island...